Efficient pebbling for list traversal synopses
We show how to support efficient back traversal in a unidirectional list,
using small memory and with essentially no slowdown in forward steps. Using
memory for a list of size , the 'th back-step from the
farthest point reached so far takes time in the worst case, while
the overhead per forward step is at most for an arbitrarily small
constant . An arbitrary sequence of forward and back steps is
allowed. A full trade-off between memory usage and time per back-step is
presented: vs. and vice versa. Our algorithms are based on a
novel pebbling technique which moves pebbles on a virtual binary, or -ary,
tree that can only be traversed in a pre-order fashion. The compact data
structures used by the pebbling algorithms, called list traversal synopses,
extend to general directed graphs, and have other interesting applications,
including memory efficient hash-chain implementation. Perhaps the most
surprising application is in showing that for any program, arbitrary rollback
steps can be efficiently supported with small overhead in memory, and marginal
overhead in its ordinary execution. More concretely: Let be a program that
runs for at most steps, using memory of size . Then, at the cost of
recording the input used by the program, and increasing the memory by a factor
of to , the program can be extended to support an
arbitrary sequence of forward execution and rollback steps: the 'th rollback
step takes time in the worst case, while forward steps take O(1)
time in the worst case, and amortized time per step.
Comment: 27 pages
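As an illustration of the memory/time trade-off the abstract describes, here is a minimal checkpointing baseline in Python. This is a hedged sketch only, not the paper's pebbling algorithm (which achieves far better bounds): we record a pointer every `spacing` nodes, so a back-step replays at most `spacing` forward links from the nearest checkpoint, at a memory cost of O(n/spacing) saved pointers. All class and function names here are illustrative.

```python
class Node:
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt

def build_list(values):
    # build a singly linked (unidirectional) list
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head

class BackTraverser:
    """Checkpoint every `spacing` nodes; back-steps replay from the
    nearest checkpoint at or before the target position."""
    def __init__(self, head, spacing):
        self.spacing = spacing
        self.pos = 0
        self.node = head
        self.checkpoints = {0: head}   # position -> node

    def forward(self):
        self.node = self.node.next
        self.pos += 1
        if self.pos % self.spacing == 0:
            self.checkpoints[self.pos] = self.node

    def back(self):
        assert self.pos > 0
        target = self.pos - 1
        base = (target // self.spacing) * self.spacing
        node = self.checkpoints[base]
        for _ in range(target - base):   # at most spacing - 1 replayed links
            node = node.next
        self.node, self.pos = node, target

head = build_list(list(range(10)))
t = BackTraverser(head, spacing=3)
for _ in range(7):
    t.forward()
t.back()
t.back()
print(t.node.value)   # -> 5
```

The paper's pebbling scheme replaces this fixed-spacing policy with pebbles moved on a virtual tree, shrinking both the memory and the per-back-step replay cost.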
Efficient Bundle Sorting
AMS subject classification. 68W01
DOI. 10.1137/S0097539704446554
Many data sets to be sorted consist of a limited number of distinct keys. Sorting such data sets can be thought of as bundling together identical keys and having the bundles placed in order; we therefore denote this as bundle sorting. We describe an efficient algorithm for bundle sorting in external memory, which requires at most c(N/B) log_{M/B} k disk accesses, where N is the number of keys, M is the size of internal memory, k is the number of distinct keys, B is the transfer block size, and 2 < c < 4. For moderately sized k, this bound circumvents the Θ((N/B) log_{M/B}(N/B)) I/O lower bound known for general sorting. We show that our algorithm is optimal by proving a matching lower bound for bundle sorting. The improved running time of bundle sorting over general sorting can be significant in practice, as demonstrated by experimentation. An important feature of the new algorithm is that it is executed "in-place," requiring no additional disk space
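The bundling idea has a simple in-memory analogue, sketched below with illustrative names. This is not the paper's external-memory algorithm (whose contribution is the I/O bound with block transfers); it only shows the core step of grouping equal keys into contiguous bundles in place, cycle-leader style, after one counting pass.

```python
def bundle_sort(a):
    """Group equal keys into ordered, contiguous bundles, in place."""
    keys = sorted(set(a))                  # the k distinct keys
    count = {}
    for x in a:
        count[x] = count.get(x, 0) + 1
    # nxt[key] = next unfilled slot in key's bundle; end[key] = one past it
    nxt, end, offset = {}, {}, 0
    for key in keys:
        nxt[key] = offset
        offset += count[key]
        end[key] = offset
    for key in keys:
        while nxt[key] < end[key]:
            x = a[nxt[key]]
            if x == key:
                nxt[key] += 1              # already in its bundle
            else:
                # swap x into its own bundle's next free slot
                a[nxt[key]], a[nxt[x]] = a[nxt[x]], a[nxt[key]]
                nxt[x] += 1
    return a

print(bundle_sort([2, 1, 3, 1, 2, 2]))    # -> [1, 1, 2, 2, 2, 3]
```

Each swap places one element permanently, so the permutation pass is linear; the external-memory version instead distributes bundles across disk blocks over log_{M/B} k passes.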
Efficient Bundle Sorting
This is the published version. Copyright © 2006 Society for Industrial and Applied Mathematics
Approximate Data Structures with Applications
In this paper we introduce the notion of approximate
data structures, in which a small amount of error is
tolerated in the output. Approximate data structures
trade error of approximation for faster operation, leading to theoretical and practical speedups for a wide variety of algorithms. We give approximate variants of the van Emde Boas data structure, which support the same dynamic operations as the standard van Emde Boas data structure [28, 20], except that answers to queries are approximate. The variants support all operations in constant time provided the error of approximation is 1/polylog(n), and in O(log log n) time provided the error
is 1/polynomial(n), for n elements in the data structure.
We consider the tolerance of prototypical algorithms to approximate data structures. We study in particular Prim's minimum spanning tree algorithm, Dijkstra's single-source shortest paths algorithm, and an on-line variant of Graham's convex hull algorithm. To obtain output which approximates the desired output
with the error of approximation tending to zero, Prim's algorithm requires only linear time, Dijkstra's algorithm requires O(m log log n) time, and the on-line variant of Graham's algorithm requires constant amortized time per operation
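The error-for-speed trade can be sketched with a much simpler structure than the paper's approximate van Emde Boas trees: round each key to a power of (1 + eps), so queries only inspect buckets and any answer is within a (1 + eps) factor of the truth. The class below is an illustrative assumption of this bucketing idea, not the paper's construction.

```python
import math

class ApproxMinStructure:
    """Multiset of positive reals supporting insert/delete and an
    approximate minimum within a (1 + eps) multiplicative factor."""
    def __init__(self, eps):
        self.eps = eps
        self.buckets = {}                  # bucket index -> count

    def _idx(self, x):
        # bucket b holds keys in [(1+eps)^b, (1+eps)^(b+1))
        return math.floor(math.log(x, 1 + self.eps))

    def insert(self, x):
        i = self._idx(x)
        self.buckets[i] = self.buckets.get(i, 0) + 1

    def delete(self, x):
        i = self._idx(x)
        self.buckets[i] -= 1
        if self.buckets[i] == 0:
            del self.buckets[i]

    def approx_min(self):
        # the lower edge of the lowest non-empty bucket is within a
        # (1 + eps) factor below the true minimum
        return (1 + self.eps) ** min(self.buckets)

s = ApproxMinStructure(eps=0.1)
for v in [7.0, 3.0, 12.5]:
    s.insert(v)
m = s.approx_min()
assert m <= 3.0 <= m * 1.1    # true minimum is 3.0
```

Shrinking eps tightens the answer but multiplies the bucket count, which is exactly the trade-off the abstract's 1/polylog(n) and 1/polynomial(n) regimes quantify.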
Dynamic Generation of Discrete Random Variates
The original publication is available at www.springerlink.com
We present and analyze efficient new algorithms for generating a random variate distributed according
to a dynamically changing set of N weights. The base version of each algorithm generates the
discrete random variate in O(log* N) expected time and updates a weight in O(2^{log* N}) expected
time in the worst case. We then show how to reduce the update time to O(log* N) amortized
expected time. We finally show how to apply our techniques to a lookup-table technique in order
to obtain expected constant time in the worst case for generation and update. We give parallel
algorithms for parallel generation and update having optimal processor-time product.
Besides the usual application in computer simulation, our method can be used to perform
constant-time prediction in prefetching applications. We also apply our techniques to obtain an
efficient dynamic algorithm for maintaining an approximate heap of N elements, in which each query
is required to return an element whose value is within an ε multiplicative factor of the maximal
element value. For ε = 1/polylog(N), each query, insertion, or deletion takes O(log log log N) time
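For contrast with the bounds above, here is the standard O(log N) baseline for dynamic weighted sampling, sketched with a Fenwick (binary indexed) tree over the weights: sample index i with probability w[i]/sum(w), and change any weight on the fly. This is an illustrative baseline interface, not the paper's algorithm, which achieves substantially stronger expected bounds.

```python
import random

class DynamicSampler:
    """Weighted sampling over N dynamically updatable weights,
    O(log N) per generation and per update, via a Fenwick tree."""
    def __init__(self, weights):
        self.n = len(weights)
        self.tree = [0.0] * (self.n + 1)
        self.w = [0.0] * self.n
        for i, wt in enumerate(weights):
            self.update(i, wt)

    def update(self, i, weight):
        delta = weight - self.w[i]
        self.w[i] = weight
        j = i + 1
        while j <= self.n:                 # climb the Fenwick tree
            self.tree[j] += delta
            j += j & (-j)

    def _total(self):
        s, j = 0.0, self.n
        while j > 0:                       # prefix sum of all n weights
            s += self.tree[j]
            j -= j & (-j)
        return s

    def sample(self):
        r = random.random() * self._total()
        pos, bit = 0, 1
        while bit * 2 <= self.n:
            bit *= 2
        while bit:                         # descend: largest pos with
            nxt = pos + bit                # prefix(pos) <= r
            if nxt <= self.n and self.tree[nxt] <= r:
                r -= self.tree[nxt]
                pos = nxt
            bit //= 2
        return pos                         # 0-based sampled index

s = DynamicSampler([0.0, 1.0, 0.0])
print(s.sample())    # always 1: all mass on index 1
s.update(1, 0.0)
s.update(2, 5.0)
print(s.sample())    # always 2 after the updates
```

The lookup-table refinement the abstract mentions replaces this logarithmic descent with expected constant-time generation.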
Accounting for Memory Bank Contention and Delay in High-Bandwidth Multiprocessors
This paper considers issues of memory performance in shared memory multiprocessors that provide a high-bandwidth network and in which the memory banks are slower than the processors. We are concerned with the effects of memory bank contention, memory bank delay, and the bank expansion factor (the ratio of number of banks to number of processors) on performance, particularly for irregular memory access patterns. This work was motivated by observed discrepancies between predicted and actual performance in a number of irregular algorithms implemented for the Cray C90 when the memory contention at a particular location is high. We develop a formal framework for studying memory bank contention and delay, and show several results, both experimental and theoretical. We first show experimentally that our framework is a good predictor of performance on the Cray C90 and J90, providing a good accounting of bank contention and delay. Second, we show that it often improves performance to have addi..
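A toy model (an assumption-laden simplification of mine, not the paper's formal framework) makes the role of the expansion factor concrete: if b banks each need d cycles per request, a round of requests takes at least d times the maximum number of requests landing on one bank, and more banks per processor spread random requests more thinly.

```python
import random

def round_time(addresses, num_banks, bank_delay):
    """Lower bound on cycles to serve one round of memory requests:
    bank_delay times the most heavily loaded bank's request count."""
    load = [0] * num_banks
    for a in addresses:
        load[a % num_banks] += 1       # bank chosen by low-order bits
    return bank_delay * max(load)

random.seed(1)
p = 64                                 # processors, one request each
for expansion in (1, 2, 4, 8):
    banks = p * expansion
    addrs = [random.randrange(1 << 20) for _ in range(p)]
    print(expansion, round_time(addrs, banks, bank_delay=4))
```

Running this shows the maximum bank load (and hence the round time) shrinking as the expansion factor grows, the qualitative effect the paper quantifies and validates against the Cray machines.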